Pattern finder

Overview

The Pattern finder selector offers a user-friendly way to run regular expression searches.

How it works

Unlike the Regular expression selector, it doesn't require deep knowledge of regular expressions.

The Pattern finder selector is useful in several cases.

Extracting predefined data types

By specifying one of the predefined type extractors, you can find all values of the selected type in the document and afterward filter them with other selectors, e.g. Pick by index.
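The idea of finding every value of a type and then narrowing the result can be illustrated with a minimal Python sketch. This is not how pdf2Data implements its type extractors: the `AMOUNT` pattern, the function names `find_values` and `pick_by_index`, and the zero-based index are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for amounts such as "18.123,23"; not a pdf2Data extractor.
AMOUNT = re.compile(r"\d{1,3}(?:\.\d{3})*,\d{2}")


def find_values(text, pattern=AMOUNT):
    """Return every match of the pattern in the text, in document order."""
    return pattern.findall(text)


def pick_by_index(values, index):
    """Keep only the value at the given zero-based position, if it exists."""
    return values[index] if 0 <= index < len(values) else None


text = "SubTotal 18.123,23 AUD\nTotal 22.123,23 AUD"
values = find_values(text)        # all amounts found in the text
second = pick_by_index(values, 1) # narrow the result to a single value
```

Splitting the work into a broad "find all" step and a separate filtering step mirrors the way the selectors are described as being chained.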

When you select one of these types, you will be prompted to select a sample text against which the pattern is evaluated.

If you picked an incorrect location, you can repeat the selection at any time using the Select sample text button.

Extracting text using in-line anchors

Pattern finder, like Paragraph, is a workhorse of pdf2Data when you need to extract a textual value from a PDF and there is static text (an anchor) next to that value.

note

Labels like Customer #, Statement date :, Invoice: are static for all documents of a type and can be used as anchors.

Whenever the anchor is in line with the desired value, use Pattern finder to retrieve it. With the Add limiter button you can use the anchor text as a Prefix or a Suffix for the selector.

note

This selector looks for prefixes and suffixes in the PDF, and extracts everything in between.

Pattern finder accepts multiple values as prefixes and suffixes. If the no suffix key is used, everything between the prefix values and the end of the line is extracted. With the no prefix key, everything from the start of the line to the suffix values is extracted.

Example

This selector extracts prices from the following strings:

SubTotal 18.123,23 AUD
Total 22.123,23 AUD
SubTotal 18.123,23
Total 22.123,23
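
The prefix and suffix behaviour can be approximated with an ordinary regular expression. The following is a minimal Python sketch, not the pdf2Data implementation: the function name `extract_between` and its parameters are hypothetical, and the prefixes (`SubTotal`, `Total`) and suffix (`AUD`) are assumed limiter values chosen to fit the sample strings above.

```python
import re


def extract_between(line, prefixes=None, suffixes=None):
    """Return the text between any prefix and any suffix on one line, or None."""
    # No prefixes: start from the beginning of the line.
    start = "(?:" + "|".join(re.escape(p) for p in prefixes) + ")" if prefixes else "^"
    # No suffixes: run to the end of the line.
    end = "(?:" + "|".join(re.escape(s) for s in suffixes) + ")" if suffixes else "$"
    match = re.search(start + r"(.*?)" + end, line)
    return match.group(1).strip() if match else None


prefixes = ["SubTotal", "Total"]  # assumed limiter values

# Prefix and suffix: take what lies between the label and "AUD".
with_suffix = [
    extract_between(line, prefixes, ["AUD"])
    for line in ["SubTotal 18.123,23 AUD", "Total 22.123,23 AUD"]
]

# No suffix: take everything from the label to the end of the line.
without_suffix = [
    extract_between(line, prefixes)
    for line in ["SubTotal 18.123,23", "Total 22.123,23"]
]
```

In this sketch both lists contain the bare amounts `18.123,23` and `22.123,23`. The limiter values are escaped with `re.escape`, so they are matched as literal text and no regular expression knowledge is needed to supply them.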

Result overview

The resulting text is presented in lines (see the output types in the Picker selector).

important

The format and an example of the actual result produced by the pdf2Data Engine are described in the Recognition result specification.

Specification

For more information about properties and expert usage, visit the specification page.